
fix(hypershift/gcp): correct DNS zone name and surface cleanup errors #76993

Merged
openshift-merge-bot[bot] merged 2 commits into openshift:main from cristianoveiga:fix/dns-zone-name
Mar 30, 2026

Conversation

@cristianoveiga
Contributor

@cristianoveiga cristianoveiga commented Mar 27, 2026

Summary

  • The e2e-gke workflow had HYPERSHIFT_GCP_CI_DNS_ZONE set to hypershift-ci-zone but the actual zone in gcp-hcp-hypershift-ci is hypershift-ci-gcp-hcp-openshiftapps-com. This caused the deprovision step's DNS cleanup to silently fail, leaving orphaned DNS records.
  • The gcloud dns record-sets list command had 2>/dev/null || true which swallowed permission errors (403 Forbidden), making it appear that no DNS records existed. Replaced with explicit error handling that logs failures.
  • Note: DNS cleanup also requires roles/dns.admin for the hypershift-ci service account — see openshift-online/gcp-hcp-infra#429.
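
The error-handling change in the second bullet can be sketched roughly as follows. This is a minimal illustration, not the actual deprovision script: the function name `list_zone_records` and its arguments are hypothetical, and only the `gcloud dns record-sets list` command and the zone and project names come from the summary above.

```shell
#!/usr/bin/env bash
# Sketch only: list_zone_records is a hypothetical helper, not code from
# the real deprovision step.

# Before (as described in the summary): errors such as 403 Forbidden were
# hidden, so a failed listing looked like an empty zone:
#   gcloud dns record-sets list --zone="${ZONE}" ... 2>/dev/null || true

# After: capture stderr, log it on failure, and return gcloud's exit code.
list_zone_records() {
  local zone="$1" project="$2"
  local err_file out rc=0
  err_file="$(mktemp)"
  out="$(gcloud dns record-sets list \
    --zone="${zone}" --project="${project}" \
    --format='value(name,type)' 2>"${err_file}")" || rc=$?
  if [ "${rc}" -ne 0 ]; then
    echo "ERROR: listing DNS records in zone ${zone} failed (exit ${rc}):" >&2
    cat "${err_file}" >&2
    rm -f "${err_file}"
    return "${rc}"
  fi
  rm -f "${err_file}"
  printf '%s\n' "${out}"
}
```

Under these assumptions, a caller would run `list_zone_records hypershift-ci-gcp-hcp-openshiftapps-com gcp-hcp-hypershift-ci` and check the return code before treating an empty result as "no records".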

Test plan

  • Verify gcloud dns record-sets list --zone=hypershift-ci-gcp-hcp-openshiftapps-com --project=gcp-hcp-hypershift-ci works with the corrected zone name
  • After gcp-hcp-infra#429 is applied, re-rehearse and confirm DNS cleanup succeeds in deprovision logs
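
The first check in the test plan could also be scripted. The sketch below assumes a hypothetical helper named `verify_zone` and uses `gcloud dns managed-zones describe` to confirm that the zone exists and is readable. It is not part of this PR.

```shell
#!/usr/bin/env bash
# Sketch only: verify_zone is a hypothetical helper for illustration.

# Confirms that a Cloud DNS managed zone exists and is readable.
# gcloud's stderr is left visible so permission errors are not hidden.
verify_zone() {
  local zone="$1" project="$2"
  if gcloud dns managed-zones describe "${zone}" \
      --project="${project}" >/dev/null; then
    echo "zone ${zone} found in project ${project}"
  else
    echo "zone ${zone} not found or not accessible in project ${project}" >&2
    return 1
  fi
}
```

For example, `verify_zone hypershift-ci-gcp-hcp-openshiftapps-com gcp-hcp-hypershift-ci` should succeed with the corrected zone name, while the old `hypershift-ci-zone` value would be expected to fail.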

🤖 Generated with Claude Code

@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 27, 2026
@openshift-ci openshift-ci Bot requested review from cblecker and ckandag March 27, 2026 17:45
@cristianoveiga
Contributor Author

/pj-rehearse pull-ci-openshift-hypershift-main-e2e-gke

@cristianoveiga
Contributor Author

cristianoveiga commented Mar 27, 2026

Note: The e2e-gke rehearsal is expected to fail — this is a known, pre-existing issue unrelated to this PR.

Root cause: controlPlaneVersion stays Partial because cluster-network-operator can't complete its rollout. The cloud-network-config-controller pod is stuck in PodInitializing due to a missing secret (cloud-network-config-controller-creds). This secret is created for AWS/Azure/OpenStack but not yet for GCP.

Why it fails now: hypershift#7887 added a controlPlaneVersion status check that requires all components to reach RolloutComplete: True. Before that PR, the missing secret was invisible to the test.

Fix: hypershift#7824 (GCP-431: Add CNCC support for GCP WIF) adds the missing secret for GCP but is not yet merged. The e2e-gke job will fail until that lands.

What this rehearsal validates: The DNS zone name correction — the deprovision step should now successfully list and delete DNS records from zone hypershift-ci-gcp-hcp-openshiftapps-com instead of silently failing against the non-existent hypershift-ci-zone. Check the hypershift-gcp-gke-deprovision step logs for DNS cleanup output.

@openshift-ci-robot
Contributor

@cristianoveiga: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@cristianoveiga
Contributor Author

/retest rehearse-76993-pull-ci-openshift-hypershift-main-e2e-gke

@openshift-ci
Contributor

openshift-ci Bot commented Mar 27, 2026

@cristianoveiga: The /retest command does not accept any targets.
The following commands are available to trigger required jobs:

/test app-ci-config-dry
/test boskos-config
/test boskos-config-generation
/test build01-dry
/test build02-dry
/test build03-dry
/test build04-dry
/test build05-dry
/test build06-dry
/test build07-dry
/test build08-dry
/test build09-dry
/test build10-dry
/test build11-dry
/test check-gh-automation
/test check-gh-automation-tide
/test ci-operator-config
/test ci-operator-config-metadata
/test ci-operator-registry
/test ci-secret-bootstrap-config-validation
/test ci-testgrid-allow-list
/test clusterimageset-validate
/test config
/test core-ci-config-dry
/test core-valid
/test generated-config
/test generated-dashboards
/test hosted-mgmt-dry
/test image-mirroring-config-validation
/test jira-lifecycle-config
/test labels
/test openshift-image-mirror-mappings
/test ordered-prow-config
/test owners
/test pr-reminder-config
/test prow-config
/test prow-config-filenames
/test prow-config-semantics
/test pylint
/test release-config
/test release-controller-config
/test rover-groups-config-validation
/test secret-generator-config-valid
/test services-valid
/test stackrox-stackrox-stackrox-stackrox-check
/test step-registry-metadata
/test step-registry-shellcheck
/test sync-rover-groups
/test verified-config
/test vsphere02-dry
/test yamllint

The following commands are available to trigger optional jobs:

/test check-cluster-profiles-config

Use /test all to run the following jobs that were automatically triggered:

pull-ci-openshift-release-main-ci-operator-config
pull-ci-openshift-release-main-ci-operator-registry
pull-ci-openshift-release-main-core-valid
pull-ci-openshift-release-main-owners
pull-ci-openshift-release-main-release-controller-config
pull-ci-openshift-release-main-step-registry-metadata
pull-ci-openshift-release-main-step-registry-shellcheck
pull-ci-openshift-release-openshift-image-mirror-mappings
pull-ci-openshift-release-yamllint
Details

In response to this:

/retest rehearse-76993-pull-ci-openshift-hypershift-main-e2e-gke

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@cristianoveiga
Contributor Author

/test rehearse-76993-pull-ci-openshift-hypershift-main-e2e-gke

@openshift-ci
Contributor

openshift-ci Bot commented Mar 27, 2026

@cristianoveiga: The specified target(s) for /test were not found.

In response to this:

/test rehearse-76993-pull-ci-openshift-hypershift-main-e2e-gke

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@cristianoveiga
Contributor Author

/pj-rehearse pull-ci-openshift-hypershift-main-e2e-gke

@openshift-ci-robot
Contributor

@cristianoveiga: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@cristianoveiga cristianoveiga changed the title from "fix(hypershift/gcp): correct DNS zone name in e2e-gke workflow" to "fix(hypershift/gcp): correct DNS zone name and surface cleanup errors" Mar 27, 2026
@cristianoveiga
Contributor Author

/pj-rehearse pull-ci-openshift-hypershift-main-e2e-gke

@cristianoveiga
Contributor Author

/retest

@cristianoveiga
Contributor Author

/pj-rehearse pull-ci-openshift-hypershift-main-e2e-gke

@openshift-ci-robot
Contributor

@cristianoveiga: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

The e2e-gke workflow had HYPERSHIFT_GCP_CI_DNS_ZONE set to
"hypershift-ci-zone" but the actual zone is
"hypershift-ci-gcp-hcp-openshiftapps-com". This caused the deprovision
step's DNS cleanup to silently fail.

Additionally, the gcloud dns list command had 2>/dev/null || true which
swallowed permission errors (403 Forbidden), making it appear that no
DNS records existed. Replace with explicit error handling that logs
failures instead of hiding them.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@cristianoveiga
Contributor Author

/pj-rehearse pull-ci-openshift-hypershift-main-e2e-gke

@openshift-ci-robot
Contributor

@cristianoveiga: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@cristianoveiga
Contributor Author

/pj-rehearse pull-ci-openshift-hypershift-main-e2e-gke

@openshift-ci-robot
Contributor

@cristianoveiga: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@openshift-ci
Contributor

openshift-ci Bot commented Mar 28, 2026

@cristianoveiga: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/rehearse/openshift/hypershift/main/e2e-gke | 85703c0 | link | unknown | /pj-rehearse pull-ci-openshift-hypershift-main-e2e-gke |

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

DNS cleanup failures were logged as warnings but the step still exited
0, making orphaned DNS records invisible. Since the step has
best_effort: true, failing it won't block the job but will surface
the issue in the Prow UI.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
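
The behaviour this commit message describes (warn on each failed deletion, then fail the step at the end) might look roughly like the sketch below. The function name `cleanup_records` and the "name type" entry format are assumptions made for illustration, and the real step script may be structured differently.

```shell
#!/usr/bin/env bash
# Sketch only: cleanup_records and its entry format are hypothetical.

# Deletes each "<name> <type>" record from the zone. Individual failures
# are logged as warnings; if any occurred, the function returns non-zero
# so the step no longer exits 0 with orphaned records left behind.
cleanup_records() {
  local zone="$1" project="$2"
  shift 2
  local failures=0 entry name type
  for entry in "$@"; do
    name="${entry%% *}"
    type="${entry##* }"
    if ! gcloud dns record-sets delete "${name}" --type="${type}" \
        --zone="${zone}" --project="${project}" --quiet; then
      echo "WARNING: failed to delete ${name} (${type})" >&2
      failures=$((failures + 1))
    fi
  done
  if [ "${failures}" -gt 0 ]; then
    echo "ERROR: ${failures} DNS record(s) could not be deleted" >&2
    return 1
  fi
}
```

Because the step has best_effort: true, a non-zero exit here would be reported in the Prow UI without blocking the job, as the commit message notes.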
@openshift-ci-robot
Contributor

[REHEARSALNOTIFIER]
@cristianoveiga: the pj-rehearse plugin accommodates running rehearsal tests for the changes in this PR. Expand 'Interacting with pj-rehearse' for usage details. The following rehearsable tests have been affected by this change:

| Test name | Repo | Type | Reason |
| --- | --- | --- | --- |
| pull-ci-openshift-hypershift-main-e2e-gke | openshift/hypershift | presubmit | Registry content changed |
| pull-ci-openshift-hypershift-release-5.0-e2e-gke | openshift/hypershift | presubmit | Registry content changed |
| pull-ci-openshift-hypershift-release-4.23-e2e-gke | openshift/hypershift | presubmit | Registry content changed |
| pull-ci-openshift-hypershift-release-4.22-e2e-gke | openshift/hypershift | presubmit | Registry content changed |
Interacting with pj-rehearse

Comment: /pj-rehearse to run up to 5 rehearsals
Comment: /pj-rehearse skip to opt-out of rehearsals
Comment: /pj-rehearse {test-name}, with each test separated by a space, to run one or more specific rehearsals
Comment: /pj-rehearse more to run up to 10 rehearsals
Comment: /pj-rehearse max to run up to 25 rehearsals
Comment: /pj-rehearse auto-ack to run up to 5 rehearsals, and add the rehearsals-ack label on success
Comment: /pj-rehearse list to get an up-to-date list of affected jobs
Comment: /pj-rehearse abort to abort all active rehearsals
Comment: /pj-rehearse network-access-allowed to allow rehearsals of tests that have the restrict_network_access field set to false. This must be executed by an openshift org member who is not the PR author

Once you are satisfied with the results of the rehearsals, comment: /pj-rehearse ack to unblock merge. When the rehearsals-ack label is present on your PR, merge will no longer be blocked by rehearsals.
If you would like the rehearsals-ack label removed, comment: /pj-rehearse reject to re-block merging.

@jimdaga
Contributor

jimdaga commented Mar 30, 2026

/lgtm

@openshift-ci openshift-ci Bot added the lgtm Indicates that a PR is ready to be merged. label Mar 30, 2026
@openshift-ci
Contributor

openshift-ci Bot commented Mar 30, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cristianoveiga, jimdaga

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@cristianoveiga
Contributor Author

/pj-rehearse ack

@openshift-ci-robot
Contributor

@cristianoveiga: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@openshift-ci-robot openshift-ci-robot added the rehearsals-ack Signifies that rehearsal jobs have been acknowledged label Mar 30, 2026
@openshift-merge-bot openshift-merge-bot Bot merged commit 2771a50 into openshift:main Mar 30, 2026
10 checks passed
acornett21 pushed a commit to acornett21/openshift-release that referenced this pull request Apr 3, 2026
…openshift#76993)

* fix(hypershift/gcp): correct DNS zone name and surface cleanup errors

The e2e-gke workflow had HYPERSHIFT_GCP_CI_DNS_ZONE set to
"hypershift-ci-zone" but the actual zone is
"hypershift-ci-gcp-hcp-openshiftapps-com". This caused the deprovision
step's DNS cleanup to silently fail.

Additionally, the gcloud dns list command had 2>/dev/null || true which
swallowed permission errors (403 Forbidden), making it appear that no
DNS records existed. Replace with explicit error handling that logs
failures instead of hiding them.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(hypershift/gcp): fail deprovision step on DNS cleanup errors

DNS cleanup failures were logged as warnings but the step still exited
0, making orphaned DNS records invisible. Since the step has
best_effort: true, failing it won't block the job but will surface
the issue in the Prow UI.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>